Below we display our sessionInfo().
sessionInfo(package=NULL)
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X El Capitan 10.11.6
locale:
[1] C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] plyr_1.8.4 readr_1.1.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.10 digest_0.6.11 rprojroot_1.2 mime_0.5 R6_2.2.0 xtable_1.8-2 jsonlite_1.4
[8] backports_1.0.5 magrittr_1.5 evaluate_0.10 stringi_1.1.2 rmarkdown_1.3 tools_3.3.2 stringr_1.1.0
[15] hms_0.3 shiny_1.0.0 rsconnect_0.7 httpuv_1.3.3 yaml_2.1.14 base64enc_0.1-3 htmltools_0.3.5
[22] knitr_1.15.1 tibble_1.3.0
Crime has always been a fascinating topic of discussion. It is human nature to pay attention to gruesome murders and moral corruption. Why? We don’t know. However, we do know that, through media outlets, society has come to believe that unemployment leads to higher crime rates. Is this true? In this R Notebook, we explore the relationships between crime and unemployment, and we also analyze crime rates and state-level monetary measures on their own, in an effort to better understand the truth behind these “ideologies.” We give step-by-step instructions for how we created various graphs and charts in Tableau, and we explore the visualizations produced through Shiny.
Our employment dataset came from the University of Kentucky Center for Poverty Research (UKCPR). We focused only on the columns that relate to monetary measures (roughly the first 12 columns). Our crime data came from the U.S. Department of Justice, FBI’s annual Uniform Crime Reporting Statistics. We retrieved the data for all the crime categories listed on the website. Both sources are credible and respected institutions, so we consider the data reliable for this project.
For both datasets we run the relevant ETL operations. We clean the data by first removing special characters (e.g., - and ~) from the column names. We then decide which columns are measures and which are dimensions. For dimensions, we change NA to an empty string, change “&” to “and”, change “:” to “;”, and remove the characters " and ’. For measures, we change NA to 0 and remove all characters except for numbers and the - sign.
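The cleaning rules described above can be sketched in base R as follows. This is a minimal sketch and not our exact ETL script: the helper name `clean_df`, its `measures` argument, and the toy data frame are hypothetical. We also assume that the decimal point is kept as part of a number, which the description above does not state explicitly.

```r
# Hypothetical helper illustrating the cleaning rules described in the text.
# `measures` names the numeric columns; every other column is a dimension.
clean_df <- function(df, measures) {
  # Remove special characters such as "-" and "~" from the column names.
  strip <- function(s) gsub("[-~]", "", s)
  names(df) <- strip(names(df))
  measures <- strip(measures)

  for (col in names(df)) {
    x <- as.character(df[[col]])
    if (col %in% measures) {
      # Measures: keep digits, the minus sign and (assumption) the decimal point.
      x <- as.numeric(gsub("[^0-9.-]", "", x))
      x[is.na(x)] <- 0          # NA becomes 0
    } else {
      # Dimensions: NA becomes "", "&" becomes "and", ":" becomes ";".
      x[is.na(x)] <- ""
      x <- gsub("&", "and", x, fixed = TRUE)
      x <- gsub(":", ";", x, fixed = TRUE)
      x <- gsub("[\"']", "", x)  # drop double and single quotes
    }
    df[[col]] <- x
  }
  df
}

# Made-up rows, only to exercise each rule once.
raw <- data.frame(
  state_name = c("Alabama", "Smith & Sons: 'north'", NA),
  `Unemployment-rate` = c("7.3%", "1,234", NA),
  check.names = FALSE, stringsAsFactors = FALSE
)
clean <- clean_df(raw, measures = "Unemployment-rate")
print(clean)
```

Passing the measure names explicitly keeps the measure/dimension decision visible in one place, which matches the manual decision step described above.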
[1] "Population"
[1] "Employment"
[1] "Unemployment"
[1] "Unemployment.rate"
[1] "Marginally.Food.Insecure"
[1] "Food.Insecure"
[1] "Very.Low.Food.Secure"
[1] "Gross.State.Product"
[1] "Number.of.low.income.uninsured.children"
[1] "Percent.Low.Income.Unisured.Children"
[1] "Personal.income"
[1] "Workers..compensation"
state_name state year Population Employment Unemployment Unemployment.rate
Alabama : 5 1 : 5 2010:51 Min. : 564516 Min. : 283744 Min. : 11152 Min. : 2.700
Alaska : 5 10 : 5 2011:51 1st Qu.: 1623796 1st Qu.: 730472 1st Qu.: 54283 1st Qu.: 6.000
Arizona : 5 11 : 5 2012:51 Median : 4382667 Median : 1877812 Median : 157581 Median : 7.300
Arkansas : 5 12 : 5 2013:51 Mean : 6158836 Mean : 2796855 Mean : 244360 Mean : 7.368
California: 5 13 : 5 2014:51 3rd Qu.: 6789176 3rd Qu.: 3235551 3rd Qu.: 288906 3rd Qu.: 8.650
Colorado : 5 14 : 5 Max. :38792291 Max. :17348645 Max. :2244326 Max. :13.500
(Other) :225 (Other):225
Marginally.Food.Insecure Food.Insecure Very.Low.Food.Secure Gross.State.Product Number.of.low.income.uninsured.children
Min. :11.71 Min. : 7.883 Min. :2.036 Min. : 26570 Min. : 1.00
1st Qu.:22.16 1st Qu.:13.051 1st Qu.:4.402 1st Qu.: 76363 1st Qu.: 14.00
Median :25.47 Median :15.394 Median :5.378 Median : 190304 Median : 44.00
Mean :25.45 Mean :15.362 Mean :5.368 Mean : 315502 Mean : 83.01
3rd Qu.:28.50 3rd Qu.:17.266 3rd Qu.:6.159 3rd Qu.: 404486 3rd Qu.: 85.00
Max. :41.08 Max. :25.224 Max. :9.197 Max. :2311616 Max. :843.00
Percent.Low.Income.Unisured.Children Personal.income Workers..compensation
Min. : 0.700 Min. :2.561e+07 Min. : 7907
1st Qu.: 3.000 1st Qu.:6.498e+07 1st Qu.: 36316
Median : 4.100 Median :1.664e+08 Median : 110973
Mean : 4.756 Mean :2.685e+08 Mean : 298914
3rd Qu.: 6.100 3rd Qu.:3.427e+08 3rd Qu.: 246610
Max. :15.000 Max. :1.978e+09 Max. :2443512
[1] "Population"
[1] "Violent.crime.total"
[1] "Murder.and.nonnegligent.Manslaughter"
[1] "Legacy.rape..1"
[1] "Revised.rape..2"
[1] "Robbery"
[1] "Aggravated.assault"
[1] "Property.crime.total"
[1] "Burglary"
[1] "Larceny.theft"
[1] "Motor.vehicle.theft"
[1] "Violent.Crime.rate"
[1] "Murder.and.nonnegligent.manslaughter.rate"
[1] "Legacy.rape.rate..1"
[1] "Revised.rape.rate..2"
[1] "Robbery.rate"
[1] "Aggravated.assault.rate"
[1] "Property.crime.rate"
[1] "Burglary.rate"
[1] "Larceny.theft.rate"
[1] "Motor.vehicle.theft.rate"
State Population Violent.crime.total Murder.and.nonnegligent.Manslaughter Legacy.rape..1
Alabama : 5 Min. : 564554 Min. : 622 Min. : 7.0 Min. : 99
Alaska : 5 1st Qu.: 1623654 1st Qu.: 5386 1st Qu.: 51.5 1st Qu.: 533
Arizona : 5 Median : 4379730 Median : 15452 Median : 160.0 Median :1190
Arkansas : 5 Mean : 6157436 Mean : 23812 Mean : 285.6 Mean :1651
California: 5 3rd Qu.: 6784338 3rd Qu.: 27735 3rd Qu.: 389.0 3rd Qu.:2012
Colorado : 5 Max. :38802500 Max. :164133 Max. :1884.0 Max. :8398
(Other) :225
Revised.rape..2 Robbery Aggravated.assault Property.crime.total Burglary Larceny.theft
Min. : 110.0 Min. : 53 Min. : 432 Min. : 9551 Min. : 1689 Min. : 7273
1st Qu.: 772.8 1st Qu.: 1039 1st Qu.: 3376 1st Qu.: 42299 1st Qu.: 8058 1st Qu.: 29452
Median : 1592.0 Median : 3689 Median : 9550 Median : 125377 Median : 26196 Median : 89103
Mean : 2258.2 Mean : 6862 Mean :14761 Mean : 172925 Mean : 39707 Mean :119222
3rd Qu.: 2518.0 3rd Qu.: 7358 3rd Qu.:18087 3rd Qu.: 204282 3rd Qu.: 47990 3rd Qu.:143460
Max. :11527.0 Max. :58116 Max. :95877 Max. :1049465 Max. :245767 Max. :654626
NA's :153
Motor.vehicle.theft Violent.Crime.rate Murder.and.nonnegligent.manslaughter.rate Legacy.rape.rate..1 Revised.rape.rate..2
Min. : 244 Min. : 99.3 Min. : 0.900 Min. : 9.70 Min. : 13.30
1st Qu.: 3792 1st Qu.: 256.4 1st Qu.: 2.500 1st Qu.:23.90 1st Qu.: 32.08
Median : 8626 Median : 329.5 Median : 4.200 Median :29.00 Median : 38.00
Mean : 13996 Mean : 372.7 Mean : 4.401 Mean :30.74 Mean : 41.34
3rd Qu.: 15407 3rd Qu.: 449.4 3rd Qu.: 5.600 3rd Qu.:36.05 3rd Qu.: 47.55
Max. :168608 Max. :1326.8 Max. :21.800 Max. :89.10 Max. :125.50
NA's :153
Robbery.rate Aggravated.assault.rate Property.crime.rate Burglary.rate Larceny.theft.rate Motor.vehicle.theft.rate
Min. : 9.10 Min. : 60.0 Min. :1524 Min. : 257.2 Min. :1161 Min. : 38.9
1st Qu.: 54.60 1st Qu.:153.4 1st Qu.:2260 1st Qu.: 439.2 1st Qu.:1606 1st Qu.:138.4
Median : 85.10 Median :218.5 Median :2726 Median : 568.3 Median :1938 Median :198.2
Mean : 96.13 Mean :236.9 Mean :2802 Mean : 621.6 Mean :1972 Mean :208.1
3rd Qu.:117.95 3rd Qu.:296.2 3rd Qu.:3305 3rd Qu.: 796.5 3rd Qu.:2289 3rd Qu.:253.1
Max. :715.00 Max. :626.1 Max. :5182 Max. :1157.6 Max. :4082 Max. :835.7
Year
2010:51
2011:51
2012:51
2013:51
2014:51
This is a map of Robbery vs Unemployment per year (between 2010 and 2014). The darker the color, the higher the ratio of robbery to unemployment for each state. Notice how, from 2010 to 2014, high Robbery vs Unemployment ratios spread outward from the areas surrounding Nevada, DC, and Louisiana, like a virus!
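A ratio like the one mapped above requires joining the crime and employment data on state and year. The sketch below shows one way this could be done in base R. The column names (`State`, `Year`, `Robbery`, `state_name`, `year`, `Unemployment`) come from the summaries above, but the rows are made up, and the definition of the ratio as robbery count divided by unemployed count is an assumption for illustration.

```r
# Made-up rows; only the column names match the real datasets.
crime <- data.frame(State = c("Nevada", "Nevada", "Vermont"),
                    Year = c(2010, 2014, 2010),
                    Robbery = c(5000, 6000, 80),
                    stringsAsFactors = FALSE)
employment <- data.frame(state_name = c("Nevada", "Nevada", "Vermont"),
                         year = c(2010, 2014, 2010),
                         Unemployment = c(200000, 100000, 20000),
                         stringsAsFactors = FALSE)

# Join on state and year; the key columns are named differently in each table.
joined <- merge(crime, employment,
                by.x = c("State", "Year"),
                by.y = c("state_name", "year"))

# Assumed definition: robberies per unemployed person.
joined$robbery_per_unemployed <- joined$Robbery / joined$Unemployment
print(joined)
```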
This is a histogram, with dots showing the burglary rate (number of burglaries per 100,000 people) per year (between 2010 and 2014). The line represents the average burglary rate. Notice how the burglary rates decrease significantly from 2010 to 2014.
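The rate definition and the yearly average line can be sketched as follows. The crime data already ships a `Burglary.rate` column, so this only illustrates the arithmetic. The rows are made up, and the helper column name `rate` is hypothetical.

```r
# Made-up rows: two states in each of two years.
burglary <- data.frame(Year = c(2010, 2010, 2014, 2014),
                       Burglary = c(1000, 3000, 800, 2400),
                       Population = c(500000, 1000000, 500000, 1000000))

# Rate per 100,000 people.
burglary$rate <- burglary$Burglary / burglary$Population * 1e5

# Average rate per year, i.e. the value a reference line would show.
avg_rate <- aggregate(rate ~ Year, data = burglary, FUN = mean)
print(avg_rate)
```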
This is a scatterplot showing the relationship between Aggravated Assaults and Robbery per year (between 2010 and 2014). We can see a strong positive linear association between Robberies and Aggravated Assaults through the trend line displayed.
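The strength of a linear association like the one above can be checked numerically with a correlation coefficient and a least-squares fit. The sketch below uses made-up values; only the column names `Robbery` and `Aggravated.assault` come from our data, and we assume the Tableau trend line is an ordinary linear fit.

```r
# Made-up counts for five hypothetical state-years.
toy <- data.frame(Robbery = c(100, 200, 300, 400, 500),
                  Aggravated.assault = c(260, 410, 650, 790, 1020))

# Pearson correlation: values near 1 indicate a strong positive linear association.
r <- cor(toy$Robbery, toy$Aggravated.assault)

# Least-squares trend line: Aggravated.assault = intercept + slope * Robbery.
fit <- lm(Aggravated.assault ~ Robbery, data = toy)
slope <- coef(fit)[["Robbery"]]

print(r)
print(slope)
```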
This is a boxplot of property crime rate for each year. We can see that the median property crime rate slowly decreases as the years progress. DC has the highest property crime rate in every year, but it is flagged as an outlier only in 2012, 2013, and 2014. This suggests that the property crime rate in the other states decreases over the years while DC’s stays high, which turns DC into an outlier.
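To illustrate how a point becomes an outlier in a boxplot, the sketch below applies the common 1.5 × IQR whisker rule through base R's `boxplot.stats`. We assume the boxplot above uses this rule. The rates are made up, and the single-letter state labels are placeholders.

```r
# Made-up property crime rates for one year; "A" to "G" are placeholder states.
rates <- data.frame(
  State = c("A", "B", "C", "D", "E", "F", "G", "District of Columbia"),
  Property.crime.rate = c(2000, 2300, 2600, 2700, 2900, 3100, 3300, 5100),
  stringsAsFactors = FALSE
)

# boxplot.stats() returns the values lying beyond 1.5 * IQR from the hinges.
out_values <- boxplot.stats(rates$Property.crime.rate)$out
outliers <- rates$State[rates$Property.crime.rate %in% out_values]
print(outliers)
```

Because the whisker limits depend on the spread of the other states, a state can become an outlier without its own rate changing, simply because the rest of the distribution moves away from it.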
This is a bar chart showing the number of burglaries in each state per year. We manually filtered by big East Coast states and West Coast states. Notice how West Coast burglaries are significantly higher (by around 30k) than East Coast burglaries. This could be because of the small number of states classified as “West Coast,” giving a large standard deviation.
The crosstab shows a Crime to Employment Ratio for every state over 2010-2014. Results indicate that D.C. has the highest crime rate by far. Despite changing the sliders multiple times, it stays at a high KPI. Most states show a medium KPI under most settings. The most peaceful states seem to be Vermont and Wyoming. Vermont especially keeps a very low KPI in almost all settings.
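A slider-driven KPI of this kind amounts to binning a ratio by two adjustable thresholds. The sketch below shows the idea in base R. The function name `kpi_level`, its threshold arguments, and all numeric values are hypothetical and do not reproduce the settings used in the application.

```r
# Hypothetical KPI: classify a ratio as Low, Medium or High using two
# thresholds, which play the role of the sliders.
kpi_level <- function(ratio, low_max, medium_max) {
  cut(ratio,
      breaks = c(-Inf, low_max, medium_max, Inf),
      labels = c("Low", "Medium", "High"))
}

# Made-up crime-to-employment ratios for three hypothetical states.
ratio <- c(0.002, 0.010, 0.025)
levels_out <- as.character(kpi_level(ratio, low_max = 0.005, medium_max = 0.015))
print(levels_out)
```

Moving a threshold changes which states fall into each band, which is why a state that stays "High" across many slider settings stands out.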
The histogram shows the aggregate number of employees for each year from 2010 to 2014. Each year has its own histogram. The bucket size for the histograms is 200000 because of how varied populations are across the states. The biggest increase in overall employment happened between 2010 and 2011, though even this growth was slow. After that, employment seems to have increased very slowly over the next 2 years.
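Binning with a fixed bucket size of 200000 can be sketched as below. The employment values are illustrative, and the variable names are hypothetical.

```r
# Illustrative employment counts for five hypothetical states.
employment <- c(283744, 350000, 730472, 1877812, 1900000)
bucket_size <- 200000

# Each value is assigned to the lower edge of its 200000-wide bucket.
bucket_start <- floor(employment / bucket_size) * bucket_size

# Number of states per bucket, i.e. the bar heights of the histogram.
counts <- table(bucket_start)
print(counts)
```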
The scatterplot shows a food insecurity ratio on the Y axis and the violent crime rate on the X axis. The only real outlier is California, which has lower food insecurity but the highest crime rate by far. Among the more interesting results, Mississippi experiences a huge drop in food insecurity after 2010 and then slowly goes back up starting in 2011. This is interesting because it starts off with the highest food insecurity and then drops sharply, while its violent crime rate stays the same. In 2014, food insecurity on average appears to have risen much more than in past years. There are very few states with a food insecurity ratio of 10 or less.
The boxplot shows each state and its GSP (Gross State Product) over 2010-2014. California has by far the largest GSP, with its lowest value of 1953411 being higher than any other state’s GSP. Texas comes in second place, followed by New York. These states also have the biggest gaps between the top and bottom of their boxes. The smaller GSPs are all much more tightly clustered. Vermont has all 5 of the lowest GSP values, but its GSP has been going up over the past few years.
The black bars on the bar chart represent the total number of violent crimes committed. Each state has its own graph showing the number of crimes committed in each year from 2010 to 2014. The red line represents the mean number of violent crimes committed in that particular state over the timespan. The blue line represents the difference between the mean and the respective year. The gap between the state with the fewest crimes and the state with the most was shocking. California averaged close to 158000 crimes a year, while Vermont averaged only around 800 a year. Outside of that, the results seemed to indicate that the murder rate stayed fairly consistent in most of the US over the time period.
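The per-state mean (red line) and the yearly difference from that mean (blue line) can be sketched in base R as follows. The column name `Violent.crime.total` comes from the crime data, but the states "X" and "Y" and all counts are made up, and the helper column names are hypothetical.

```r
# Made-up counts for two placeholder states over three years.
violent <- data.frame(
  State = rep(c("X", "Y"), each = 3),
  Year = rep(2010:2012, times = 2),
  Violent.crime.total = c(900, 800, 700, 150000, 160000, 164000),
  stringsAsFactors = FALSE
)

# Mean per state, repeated on every row of that state (ave() defaults to mean).
violent$state_mean <- ave(violent$Violent.crime.total, violent$State)

# Difference between each year's total and the state's mean.
violent$diff_from_mean <- violent$Violent.crime.total - violent$state_mean
print(violent)
```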
The data reveal many interesting observations. From high robbery ratios spreading from the areas surrounding Nevada, DC, and Louisiana to other states, to West Coast burglaries being significantly more numerous than East Coast burglaries, we get a broad idea of the rate at which crimes have been happening in recent years. Regardless of how a state is doing in terms of monetary stability or unemployment rates, we observed burglary rates and property crime rates decreasing over the years. If conditions stay the same, it seems reasonable to expect that the coming years won’t be a “Prime Time for Crime.”
The Shiny application is available at: https://tesseract2010.shinyapps.io/finalproject/